SUMMARY

Sequences encoding the N protein of the bovine enteritic coronavirus F15
strain (BECV-F15) were cloned in the pBR322 plasmid using cDNA produced
by oligo-dT priming on purified viral genomic RNA. Some 265
insert-containing clones were studied. Hybridization of these inserts with
poly(A)+ RNA extracted from infected cells showed that they were
located at the 3'-end of the genome.

After subcloning in M13 phage DNA, clones were sequenced by the Sanger
technique. A 1,710-nucleotide sequence corresponding to the gene coding for
the viral N protein was established. It contains 2 overlapping open reading
frames (ORF). The 3'-non-coding end of the gene has an 8-nucleotide sequence
in common with the homologous genome areas of MHV, TGE and IBV viruses.
This sequence may represent the RNA polymerase binding site.

The sequence surrounding the first AUG of the smaller ORF makes it a
potentially functional initiation codon. The primary translation product
deduced from the DNA sequence is a polypeptide of 207 amino acids (22.9 Kd)
with a high leucine content (19.8 %) and a hydrophobic N-terminal end.

The larger ORF has a coding capacity of 448 amino acids (49.4 Kd),
corresponding to the N-protein molecular weight. The deduced protein
possesses 43 serine residues (9.6 % of the total amino acid content) which
may be phosphorylated and involved in N-protein/RNA binding. The N protein
also has 5 regions with a high basic amino acid content. One of them
is also serine-rich and has a strong homology site with MHV, TGE and
IBV viruses. In the first part of the N-terminal end, a 12-amino-acid
sequence (PRWYFYYLGTGP) is highly conserved among BECV-F15, JHM, TGE and
IBV viruses. BCV Mebus strain and BECV-F15 have only minor differences in
their N-protein sequence.

Key words: Coronavirus, Protein, Nucleocapsid, Genome; BECV-F15
strain, N-protein sequence.


INTRODUCTION 

Bovine enteritic coronavirus (BECV) belongs to the monogeneric
Coronaviridae family, whose type species is avian infectious bronchitis virus.
Coronaviruses are pleomorphic, enveloped particles surrounded by a fringe of
"club-shaped" spikes that resemble a corona under the electron microscope
and give the family its name. The viral genome is a positive single-stranded
RNA of approximately 18 to 20 kb whose 3'-end is polyadenylated [19, 22].
This genome codes for the viral proteins, namely nucleocapsid (N),
membrane (E1), spike (E2) and several non-structural proteins. They are
translated from a 3'-coterminal nested set of mRNA, each also having a
common 5'-leader sequence [8]. Only the unique 5'-terminal sequence of each
mRNA, which is not present in the next smaller RNA of the set, is translated.

It was recently established that BECV in fact contains 4 main structural
proteins: the nucleoprotein N (50 Kd), the transmembrane E1 glycoprotein
(28 Kd) and 3 peplomer glycoproteins E2, gp105 and gp95. The haemagglutinin
protein E2 (125 Kd) is cleaved by reducing agents into 2 subunits having
molecular weights of 65 Kd; the main neutralizing epitopes of the viral
particle are located on gp105 (105 Kd) [9, 24, 6]; the structure of gp95
(95 Kd) is not clearly established.

BECV induces very severe, often fatal, diarrhoea in young calves. It
was described for the first time in the United States of America [13]; we have
been able to isolate such a virus from the faeces of diarrhoeic calves in
France and to reproduce the disease experimentally [4]. These 2 strains of
BECV are distinguishable by using monoclonal antibodies [23].


BECV = bovine enteritic coronavirus.
BSA = bovine serum albumin.
FCS = foetal calf serum.
N = nucleocapsid.
ORF = open-reading frame.

Vaccines produced from cell cultures of attenuated or inactivated BECV
are not totally protective, and they require the production of large volumes
of viral suspension because of the low infectious titre obtained in authorized
cell lines. For these reasons, we have started cloning and sequencing the
French F15 strain of BECV in order to produce cheaper and more efficient
vaccines by genetic engineering or by oligopeptide synthesis.


MATERIALS AND METHODS 

Cell culture and virus production. 

HRT18 cells (human rectal tumour cell line) were grown in RPMI-1640 medium
containing 15 % foetal calf serum (FCS) [10], except that tylosin (10 µg/ml)
and lincomycin (200 µg/ml) were added to the medium instead of penicillin and
streptomycin.

Bovine enteritic coronavirus F15 strain (BECV-F15) was isolated from diarrhoeic
calf faeces, then directly adapted to HRT18 cells [10] and plaque-purified. It
was grown as previously described [4]. Infectious titres reached 5 × 10^5
plaque-forming units (PFU)/ml.


Virus purification. 

After freezing and thawing of infected cells together with supernatant and then
clarification, the virus was purified by 2 ultracentrifugation steps (velocity
then isopycnic) [9].


Genomic RNA purification. 

A 1-ml sample of purified virus suspension in distilled water was added to the
same volume of 2-fold concentrated TNE buffer (20 mM Tris-HCl pH 8, 200 mM
NaCl, 2 mM EDTA) containing 400 µg of proteinase K. After incubation for 30 min
at 37°C, then for 5 min at 50°C, an equal volume of the same buffer containing
2 % SDS was added and incubation was continued for 30 min at 25°C.

Genomic RNA was phenol/chloroform-extracted, then precipitated in 2.5 volumes
of 0.25 M sodium acetate in ethanol. After one night at -20°C, the RNA
suspension was centrifuged for 20 min at 10,000 g; the pellet was washed with
75 % ethanol, dried and dissolved in a minimal volume of distilled water. One
optical density (OD) unit at 260 nm corresponded to 40 µg/ml of single-stranded
RNA [12].
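
As a worked illustration of the conversion factor quoted above, the following minimal Python sketch turns an absorbance reading into an RNA concentration. The function name and the dilution parameter are illustrative assumptions and are not part of the original protocol.

```python
# Conversion factor stated in the text:
# 1 OD unit at 260 nm corresponds to 40 ug/ml of single-stranded RNA.
UG_PER_ML_PER_OD260 = 40.0


def rna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """Estimate the RNA concentration (ug/ml) of the undiluted sample.

    `dilution_factor` is a hypothetical convenience parameter: use 10 if
    the sample was diluted 1:10 before the absorbance was read.
    """
    if od260 < 0 or dilution_factor <= 0:
        raise ValueError("absorbance must be >= 0 and dilution factor > 0")
    return od260 * UG_PER_ML_PER_OD260 * dilution_factor


# Example: an absorbance of 0.25 read on a 10-fold dilution.
print(rna_concentration_ug_per_ml(0.25, dilution_factor=10))
```

The estimate is only as good as the absorbance reading; contaminating protein or phenol also absorbs near 260 nm and would inflate the result.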


cDNA cloning. 

The synthesis of cDNA complementary to the 3'-end of the BECV-F15 genome
was carried out in a volume of 52 µl: 10 µg of BECV RNA in 10 µl, denatured at
65°C for 5 min and quickly chilled in an ice bath, were added to 42 µl of
100 mM Tris-HCl pH 8.3 at 42°C containing 100 mM KCl, 100 mM MgCl2, 10 mM
dithiothreitol, 4 µg actinomycin D, 500 µM each of the 4 dNTP, 75 units RNasin,
140 units reverse transcriptase (P.H. Stehelin) and, as primer, 10 µg oligo-dT.
Incubation was performed for 2 h at 42°C and the reaction was stopped by adding
2 µl of 500 mM EDTA. Reaction products were extracted with phenol/chloroform,
then chloroform, and ethanol-precipitated. Free RNA strands not hybridized with
cDNA were digested with endonuclease T2 [25]; these digests and free
nucleotides were removed by gel filtration on a spun column of "Sephadex-G50"
medium (Pharmacia) [12].

The RNA-cDNA heteroduplexes were then poly-dC tailed: 2 pmoles of 3' ends
were dissolved in 20 µl of 25 mM Tris-HCl buffer pH 7 containing 100 mM
K-cacodylate, 0.2 mM DTT, 1 mM CoCl2, 0.2 mM dCTP, 50 µg bovine serum
albumin (B.R.L.), 13.5 units of terminal deoxynucleotidyl transferase (B.R.L.)
and 30 µCi α-32P-dCTP (3,000 Ci/mmole). The reaction was carried out at 37°C
for 3 min and stopped by adding 2 µl of 500 mM EDTA [16]. The product was
phenol/chloroform-extracted. An average of 20 dC residues per 3'-end of
heteroduplex was obtained.

C-tailed heteroduplexes were annealed to dG-tailed PstI-linearized pBR322
plasmid (1 mole for 2 moles), in a volume where the plasmid was at a
concentration of 5 ng/µl, at 65°C for 10 min. Competent RR1 Escherichia coli
cells were transfected with this material [5]. The total DNA concentration was
0.25 µg/ml.


Identification of specific BECV inserts. 

E. coli cells were grown overnight in a medium containing 12 µg/ml tetracycline,
then treated by alkaline lysis [12]. Plasmid DNA was extracted by
phenol/chloroform treatment and ethanol-precipitated. DNA inserts were excised
with PstI restriction enzyme: 1.2 µl of 10-fold concentrated buffer (100 mM
Tris-HCl pH 7.5, 1 M NaCl, 100 mM MgCl2, 1 mg/ml BSA) and 2 units of PstI
enzyme (B.R.L.) were added to 10 µl of plasmid DNA solution. Insert size was
established by electrophoretic migration in 1 % agarose gels in TBE buffer
(89 mM Tris, 89 mM boric acid, 2 mM EDTA).

Probes were prepared by nick-translation in a 20 µl volume containing 0.5 µg
DNA, 2 µl of 10-fold concentrated buffer (500 mM Tris-HCl pH 7.2, 100 mM
MgSO4, 1 mM DTT, 500 µg/ml BSA), 20 µM each of the 4 dNTP, 2.5 ng pancreatic
DNase I (Boehringer), 40 µCi α-32P-dCTP (800 Ci/mmole) and 0.8 unit DNA
polymerase I. The mixture was incubated for 2 h at 16°C. The reaction was
stopped by adding 3 µl of 500 mM EDTA pH 8. Free nucleotides were removed by
filtration through a spun column.

Northern and Southern blots were performed as described by Maniatis [12].
Probes were incubated for hybridization overnight at 42°C (Southern) or at
55°C (Northern); blots were then washed in low salt concentration solutions:
three times for 15 min in 0.1 % SDS, 2 × SSC and twice for 15 min in
0.1 % SDS, 0.1 × SSC at 52°C.


DNA sequencing and sequence analysis. 

M13 dideoxy sequencing was carried out according to the Sanger technique [17],
using α-35S-dATP (New England Nuclear). In short, the main steps were the
following:

DNA replicative forms of mp18 or mp19 M13 phage were prepared [3]; they
possess polylinkers with single cleavage sites for EcoRI, SacI, KpnI, SmaI,
BamHI, SalI, PstI, SphI and HindIII restriction enzymes.

Viral cDNA inserts were excised from the pBR322 plasmid and treated with
restriction enzymes having sites in the M13 polylinker. DNA fragments ranging
between 300 and 500 bases were purified by electrophoresis in low melting point
agarose (Gibco-BRL) gel. M13 phage DNA was cleaved by the same enzymes and
5' end phosphates were removed by alkaline phosphatase (Boehringer) treatment
[12]. DNA were then phenol/chloroform-extracted and ethanol-precipitated. After
ligation of the insert in the vector, performed with 50 ng of insert in a molar
ratio of 3/1, competent E. coli TG1 cells were transfected [5]. TG1 recombinant
clones were selected in an IPTG- and X-gal-containing medium. White plaques
were then checked by hybridization with a radioactive insert probe.

Sequencing was then performed using a primer complementary to the 3'-end of
the DNA strand to be transcribed. These primers were synthesized in an
automated DNA synthesizer (Biosearch 8600).

Sequence data were analysed and assembled with the aid of the program of Queen
and Korn [14] in the "Beckman Microgenie" package (March 1985 version, Beckman
Instruments, Inc.) adapted to the "IBM PC-XT" microcomputer.


RESULTS 


cDNA cloning. 

Starting material for cDNA synthesis was 10 µg of purified and heat-denatured
viral RNA. When analysed by electrophoresis in alkaline agarose gels, the sizes
of the cDNA obtained using oligo-dT as a primer ranged between 1.3 and 6.0 Kb.
After annealing of the heteroduplexes to pBR322, this construction was
transfected into competent E. coli cells and we obtained 2 × 10^5 clones/µg
of pBR322.

Some 265 colonies containing 0.3- to 2.0-Kb inserts were studied. Inserts
larger than 0.5 Kb very often showed an internal PstI site (results not
shown). Their viral specificity was checked, after 32P-labelling by
nick-translation, by hybridization with purified genomic viral RNA or cellular
RNA (fig. 1). Viral-specific inserts were further used for characterization of
other inserts.

Insert orientation was established by hybridization with inserts having no
PstI site and by restriction endonuclease mapping with enzymes having no
or only one cleavage site in the pBR322 plasmid.

The location of the inserts along the viral genome was determined by Northern
blot analysis: full-length inserts or purified products of insert restriction
cleavage were hybridized with poly(A)+ RNA extracted from infected or
non-infected cells. Before hybridization these RNA were electrophoresed in
methylmercury hydroxide-containing agarose gel. Under these experimental
conditions, 8 viral-specific poly(A)+ messenger RNA bands were resolved
(J. Laporte and C. Cruciere; to be published). They form a specific nested set
of RNA, as established for other coronaviruses. All the inserts we obtained
hybridized with the 8 viral RNA bands (results not shown); they were therefore
complementary to the 3' end of the viral genome.

Figure 2 presents the schematic location of the inserts we have studied.
Insert 1.6 is 2,000 nucleotides long and the 5'-end of insert 2.56 is
presumably 2,400 nucleotides from the 3'-end of the viral genome. As deduced
from the sizes of the N and E1 viral proteins, they should cover the whole
length of the N gene (1,700 nucleotides) and the beginning of the 5'-adjacent
E1 gene (320 nucleotides).

Fig. 1. Screening of insert virus specificity.

Radioactive probes were prepared from insert-containing pBR322 plasmid. These
probes were hybridized on nitrocellulose sheets with dots of RNA extracted from
non-infected (C) or BECV-F15-infected (V) HRT18 cells. Hybridization was
checked by autoradiography. In the experiment shown, inserts 1.6, 1.22 and 2.56
were clearly virus-specific.


BECV-F15 CORONAVIRUS N PROTEIN SEQUENCE

Fig. 2. Arrangement of some of the cDNA clones obtained using oligo-dT as primer.

[Figure: schematic map of the 3'-end of the genome showing BamHI, PstI, SacI,
SphI and PvuII restriction sites, with the aligned clones 1.95, 1.79, 1.1, 1.4,
1.5, 1.6, 1.7, 1.12, 1.13, 1.14, 1.19, 1.20, 1.22, 1.26, 1.31, 1.38, 1.39, 1.47,
1.51, 1.56, 1.61, 2.19 and 2.56.]


cDNA sequencing.

As mentioned above, 400-bp fragments of the cDNA clones were subcloned
in mp18 or mp19 M13 phage DNA. Their nucleotide sequences were determined
by sequencing both M13 DNA strands or by multiple sequencing of one strand.
We have been able to establish a 1,710-nucleotide sequence from the 3'-end
of the genome (fig. 3). This sequence has 2 overlapping open-reading frames
(ORF). The main ORF stretches from nucleotide 74 to nucleotide 1,416, the
smaller one from nucleotide 135 to nucleotide 755 (fig. 3). The first has
a coding capacity for a 448-amino-acid protein, the second for a
207-amino-acid protein (fig. 4).
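
To make the notion of overlapping ORF in different reading frames concrete, here is a minimal Python sketch of a forward-strand ORF scan. It is not the Microgenie program used for the analysis; the function name, the toy sequence and the coordinate convention (1-based, end position taken as the last nucleotide before the stop codon) are assumptions chosen for illustration.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orfs(rna, min_codons=1):
    """Return (start, end, n_codons) for each AUG-to-stop ORF.

    Positions are 1-based; `end` is the last nucleotide before the stop
    codon and `n_codons` counts sense codons, AUG included. Only the three
    forward frames are scanned, and ORF still open at the end of the
    sequence are ignored.
    """
    rna = rna.upper().replace("T", "U")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None and codon == "AUG":
                start = i  # first AUG met in this frame opens the ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start + 1, i, n_codons))
                start = None
    return sorted(orfs)


# Hypothetical toy sequence with two ORF that overlap in different frames.
toy = "AUGGAUGCCUGACUAA"
for start, end, n_codons in find_orfs(toy):
    print(start, end, n_codons)
```

In the toy sequence the first ORF covers nucleotides 1 to 9 and the second nucleotides 5 to 13, so nucleotides 5 to 9 are read in two different frames, which is the situation described for the main and secondary ORF.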


DISCUSSION 

We have determined, by cDNA cloning of BECV-F15 genomic RNA using 
an oligo-dT primer, a sequence of 1,710 nucleotides. 

We assume that this sequence comprises the nucleocapsid protein gene.

For every coronavirus so far studied, the gene coding for the N protein 
is located at the 3’-end of the viral genome. The same conclusion arises from 
our studies on the BECV-F15 poly(A) + RNA (to be published). 

The largest ORF is 1,344 nucleotides long and encodes a 448-amino-acid
protein with a molecular weight of 49.4 Kd. Our previous results [4] had
shown a 50-Kd molecular weight for the N protein.
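
The arithmetic linking ORF length, residue count and approximate molecular weight can be sketched as follows. The average residue mass of 110 Da is a common rule of thumb and an assumption here; the 49.4 Kd quoted in the text was presumably computed from the actual amino acid composition, so a small difference is expected.

```python
# Assumption: rule-of-thumb average mass of one amino acid residue.
AVERAGE_RESIDUE_MASS_DA = 110.0


def orf_to_protein(orf_length_nt):
    """Return (number of residues, approximate mass in Kd) for an ORF.

    `orf_length_nt` is the length of the coding part in nucleotides and
    must be a positive multiple of 3.
    """
    if orf_length_nt <= 0 or orf_length_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    n_residues = orf_length_nt // 3
    approx_kd = n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0
    return n_residues, approx_kd


# Main ORF: 1,344 nucleotides as stated in the text.
n_residues, approx_kd = orf_to_protein(1344)
print(n_residues, round(approx_kd, 1))
```

With the 1,344-nucleotide ORF this gives 448 residues and roughly 49.3 Kd, in line with the 49.4 Kd and 50 Kd values discussed above.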

Recently, the N protein gene of the US Mebus strain of the related bovine
coronavirus (BCV) was also shown to lie at the 3'-end of the viral
genome [11].


Open-reading frames. 

Main ORF. The next AUG after the initiation codon lies 693 nucleotides
downstream of it. When we compared the sequence around the initiation codon
to homologous sequences of different strains of MHV, we found the same
CTAAAC sequence upstream of the initiation AUG.

Secondary ORF. The consensus sequence GUAAUGGC surrounding its initiation
codon provides an optimal context for initiating mRNA translation [7].
Bunyaviruses and adenoviruses express 2 different proteins from only one gene
by having 2 overlapping ORF [7]. We therefore cannot exclude the translation
of a protein from the secondary ORF. Its predicted molecular weight is 22.9 Kd
for 207 amino acids. This protein has a rather high leucine content: 19.8 %
compared to 5 % for the N protein. Furthermore, its N-terminal end is
hydrophobic and is a potential membrane anchor region. Genes presenting
2 different ORF are also described for other coronaviruses: mRNA 5 of
JHM virus [20], mRNA D of IBV [2] and N protein mRNA of the Mebus BCV
strain [11].


Non-coding 3’-end. 

This part of the genome may play an important role during the transcription
of genomic RNA into the complementary minus RNA strand. Sequence homology
between BECV-F15 and MHV for the last 100 nucleotides of the coding part is
only 59 %, but homology increases to 75 % for the 3'-non-coding end.
A 10-nucleotide sequence (GGGAAGAGCT) was found in common at the same place
of this gene area for MHV and IBV viruses [2] (fig. 5). We find an identical
sequence (except the last T) for BECV-F15 virus between nucleotides 1,631 and
1,640. When looking at the TGEV genome sequence [15], we observe the same
sequence (except the first G) in the 3'-non-coding end between nucleotides
1,923 and 1,931. Our analysis strengthens Boursnell's hypothesis: this
sequence, well conserved among the Coronaviridae family, should have an
important function during RNA replication as a possible RNA polymerase
binding site.

Fig. 3. Nucleotide sequence of the 3'-end of the BECV-F15 genome.

This 1,710-nucleotide sequence has 2 large overlapping ORF. M = potential
translation initiation codons; U = translation stop codons. The main and
secondary ORF are marked by lines under the sequence.

[Figure: 1,710-nucleotide RNA sequence printed in rows of 73 nucleotides
(row starts numbered 74 to 1,680), with M and U annotations beneath the rows.]
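
The comparison of the conserved 3' non-coding motif discussed in this section can be sketched as a search for the longest stretch shared by all the variants. The strings below are transcribed from the description in the text; the BECV-F15 position that differs from the final T is written as N because the text does not name the substituted base here, and the function name is a hypothetical helper.

```python
def longest_common_substring(sequences):
    """Return the longest substring present in every sequence (brute force)."""
    shortest = min(sequences, key=len)
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in seq for seq in sequences):
                return candidate
    return ""


# Motif variants as described in the text (assumptions noted above).
variants = {
    "MHV/IBV": "GGGAAGAGCT",   # 10-nucleotide sequence reported in [2]
    "BECV-F15": "GGGAAGAGCN",  # identical except the last T (N = unspecified)
    "TGEV": "GGAAGAGCT",       # same sequence except the first G
}

core = longest_common_substring(list(variants.values()))
print(core, len(core))
```

The shared core is 8 nucleotides long, which matches the 8-nucleotide common sequence mentioned in the summary.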


N-protein-predicted amino acid sequence. 

BECV-F15 N protein has very strong homology with the same protein of
JHM virus (70.3 %) (fig. 6) and only 25.2 % and 24.1 % homology, respectively,
with TGE and IBV virus N proteins. These coronavirus N proteins are
phosphorylated on their serine residues [18]. Our results show 43 serine
residues in the BECV-F15 nucleocapsid protein (9.6 % of the total amino acids).
For this virus and for JHM, TGE and IBV viruses we find 2 main areas where
serine residues are clustered. For BECV-F15 and JHM viruses they are in
homologous areas (amino acids 9 to 19 and amino acids 191 to 220) of low
overall homology (58 % and 53 %). One serine cluster is common to the
4 viruses. This fact is striking because of the low sequence homology between
these viruses.

It was previously established [21, 1] that the genomic RNA binding sites of
the N protein are located in the basic portions of the protein. Over the
complete sequence there is an excess of 19 basic residues compared to acidic
residues. There are 5 basic-rich regions which are found in homologous areas
of MHV, TGE and IBV viruses. Concerning BECV-F15 and MHV, 4 of these areas
have 90 % homology. The fifth has only 60 % homology but is also serine-rich
and possesses a sequence in common with TGE and IBV viruses (amino
acids 193 to 222). It may have a more specific function in protein/RNA
recognition.
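
The composition figures used in this discussion (serine percentage, excess of basic over acidic residues) can be computed along the following lines. This is a minimal sketch: the choice of lysine and arginine as basic residues (histidine excluded) is an assumption, since the text does not say how basic residues were counted, and the peptide is a hypothetical toy sequence, not part of the N protein.

```python
# Assumption: K and R counted as basic, D and E as acidic (His excluded).
BASIC = set("KR")
ACIDIC = set("DE")


def basic_excess(protein):
    """Number of basic residues minus number of acidic residues."""
    seq = protein.upper()
    n_basic = sum(residue in BASIC for residue in seq)
    n_acidic = sum(residue in ACIDIC for residue in seq)
    return n_basic - n_acidic


def residue_percent(protein, residue):
    """Percentage of `residue` (one-letter code) in the protein."""
    if not protein:
        raise ValueError("empty sequence")
    return 100.0 * protein.upper().count(residue.upper()) / len(protein)


# Hypothetical toy peptide, for illustration only.
toy_peptide = "MKRSSDESRK"
print(basic_excess(toy_peptide), residue_percent(toy_peptide, "S"))

# Check of the figure quoted in the text: 43 serines out of 448 residues.
print(round(100.0 * 43 / 448, 1))
```

Applied to the full deduced N protein, the same two functions would be expected to reproduce the 9.6 % serine content, and the basic excess of 19 if the residues were counted in the same way.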

We also observed a strong sequence homology, not yet described, in the
first part of the N-terminal end of the N proteins of BECV-F15, BCV, MHV,
TGEV and IBV viruses:

Virus        Amino acid no.    Amino acid sequence

BECV-F15     118 to 134        QLLPRWYFYYLGTGPHA
JHMV         121 to 135        QLLPRWYFYYLGTGP
TGEV         89 to 101             RW FYYLGTGPHA
IBV          91 to 102              WYFYY GTGP A

Blanks within a sequence mark positions that differ from the BECV-F15 sequence.

This sequence has no particular properties: 9 hydrophilic and 8 hydrophobic
residues. The biological significance of these findings is not known.
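
The homology percentages quoted in this section are of the kind obtained by counting identical positions in an alignment. The following minimal Python sketch shows one way to do this; it is not necessarily the method used by the authors, and the "X" in the TGEV segment is a placeholder for the non-conserved residue that the table leaves blank.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of identical residues over the aligned, ungapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # columns with a gap are not scored
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


# Segments taken from the table above; "X" stands for the TGEV residue
# that is not given in the table (placeholder, not a real residue call).
becv_segment = "RWYFYYLGTGPHA"
tgev_segment = "RWXFYYLGTGPHA"
print(round(percent_identity(becv_segment, tgev_segment), 1))
```

Over these 13 aligned positions, 12 are identical, which illustrates how strongly this short segment is conserved compared with the 25.2 % figure reported for the whole TGE virus N protein.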

In conclusion, we have noticed that there are only minor changes between
BECV-F15 and BCV Mebus strain N proteins. Work is in progress to sequence
the other virus genes and to find out how similar these two viruses in fact
are. Because of the antigenic differences established by monoclonal antibody
screening, the specificities should be found on the gene coding for the spike
gp105 protein.

[Figure: nucleotide sequence of the 3'-end of the BECV-F15 genome with the
amino acid sequences deduced from the main and secondary ORF printed above
each row in one-letter code.]

Fig. 4. Amino acid sequence of the proteins predicted from the main and secondary ORF.

RÉSUMÉ

SEQUENCE AND ANALYSIS OF THE GENOME
OF THE BOVINE ENTERITIC CORONAVIRUS (F15)


I. Sequence of the gene coding for the nucleocapsid protein;
analysis of the deduced protein

We cloned the genomic RNA of bovine enteritic coronavirus F15 (BECV-F15)
in the pBR322 plasmid after preparing the corresponding cDNA with an
oligo-dT primer; 265 clones were studied. Their hybridization with poly(A)+
RNA extracted from infected cells allowed us to locate them at the
3'-terminal end of the genome.

These clones were sequenced by the Sanger technique after subcloning in
M13 phage DNA. We determined a sequence of 1,710 nucleotides corresponding
to the gene coding for the viral N protein. It has two overlapping open
reading frames (ORF). At the non-coding 3'-terminal end of the genome there
is an 8-nucleotide sequence that is also found in the homologous region of
MHV, TGE and IBV viruses. This sequence could be the RNA polymerase binding
site.

The first AUG of the smaller ORF is preceded by a nucleotide sequence that
makes it a potentially functional initiation site. The primary translation
product deduced from it is a polypeptide of 207 amino acids (22.9 Kd) with a
high leucine content (19.8 %) and a hydrophobic N-terminal end.

The larger ORF has a coding capacity of 448 amino acids (49.4 Kd),
corresponding to the molecular mass of the N protein. The deduced protein
contains 43 serine residues (9.6 % of the amino acids), which may be
phosphorylated and involved in the binding between the N protein and the
genomic RNA. This protein also has 5 strongly basic regions; one of them is
also serine-rich and has strong sequence homology with the homologous region
of the N proteins of MHV, TGE and IBV viruses. In addition, the first part of
the N-terminal end shows a stretch of 12 amino acids (PRWYFYYLGTGP) that is
highly conserved among these four viruses.


Fig. 5. Nucleotide sequence homology between the 3'-end of the genomes of
BECV-F15 and MHV-JHM viruses.

[Figure: alignment of the nucleotide sequences of the 3'-ends of the BECV-F15
(upper rows) and MHV-JHM (lower rows) genomes, with identical positions marked
between the rows.]





C. CRUCIERE AND J. LAPORTE


[Figure: alignment of the N protein amino acid sequences of BECV-F15 (upper
rows) and MHV-JHM (lower rows) in one-letter code; asterisks mark identical
residues.]

Fig. 6. Amino acid sequence homology of the N proteins of BECV-F15
and JHM-MHV viruses.

The N protein sequences of the BCV Mebus strain and of BECV-F15 show only
minor differences.

Key words: Coronavirus, Protein, Nucleocapsid, Genome; BECV-F15
strain, N-protein sequence.